Goto

Collaborating Authors

 causal estimation



A Critical Look at the Consistency of Causal Estimation with Deep Latent Variable Models

Neural Information Processing Systems

Using deep latent variable models in causal inference has attracted considerable interest recently, but an essential open question is their ability to yield consistent causal estimates. While they have demonstrated promising results and theory exists on some simple model formulations, we also know that causal effects are not even identifiable in general with latent variables. We investigate this gap between theory and empirical results with analytical considerations and extensive experiments under multiple synthetic and real-world data sets, using the causal effect variational autoencoder (CEVAE) as a case study. While CEVAE seems to work reliably under some simple scenarios, it does not estimate the causal effect correctly with a misspecified latent variable or a complex data distribution, as opposed to its original motivation. Hence, our results show that more attention should be paid to ensuring the correctness of causal estimates with deep latent variable models.


Using LLMs to Directly Guess Conditional Expectations Can Improve Efficiency in Causal Estimation

Engh, Chris, Aronow, P. M.

arXiv.org Artificial Intelligence

We propose a simple yet effective use of LLM-powered AI tools to improve causal estimation. In double machine learning, the accuracy of causal estimates of the effect of a treatment on an outcome in the presence of a high-dimensional confounder depends on the performance of estimators of conditional expectation functions. We show that predictions made by generative models trained on historical data can be used to improve the performance of these estimators relative to approaches that solely rely on adjusting for embeddings extracted from these models. We argue that the historical knowledge and reasoning capacities associated with these generative models can help overcome curse-of-dimensionality problems in causal inference problems. We consider a case study using a small dataset of online jewelry auctions, and demonstrate that inclusion of LLM-generated guesses as predictors can improve efficiency in estimation.



A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation

Song, Xinran, Chen, Tianyu, Zhou, Mingyuan

arXiv.org Machine Learning

Estimating individualized treatment effects from observational data is a central challenge in causal inference, largely due to covariate imbalance and confounding bias from non-randomized treatment assignment. While inverse probability weighting (IPW) is a well-established solution to this problem, its integration into modern deep learning frameworks remains limited. In this work, we propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation to enable accurate and fast causal estimation-including potential outcome prediction and treatment effect estimation. We demonstrate how IPW can be naturally incorporated into the distillation of pretrained diffusion models, and further introduce a randomization-based adjustment that eliminates the need to compute IPW explicitly-thereby simplifying computation and, more importantly, provably reducing the variance of gradient estimates. Empirical results show that IWDD achieves state-of-the-art out-of-sample prediction performance, with the highest win rates compared to other baselines, significantly improving causal estimation and supporting the development of individualized treatment strategies. We will release our PyTorch code for reproducibility and future research.


Estimating Causal Effects of Text Interventions Leveraging LLMs

Guo, Siyi, Marmarelis, Myrl G., Morstatter, Fred, Lerman, Kristina

arXiv.org Artificial Intelligence

Quantifying the effect of textual interventions in social systems, such as reducing anger in social media posts to see its impact on engagement, poses significant challenges. Direct interventions on real-world systems are often infeasible, necessitating reliance on observational data. Traditional causal inference methods, typically designed for binary or discrete treatments, are inadequate for handling the complex, high-dimensional nature of textual data. This paper addresses these challenges by proposing a novel approach, CausalDANN, to estimate causal effects using text transformations facilitated by large language models (LLMs). Unlike existing methods, our approach accommodates arbitrary textual interventions and leverages text-level classifiers with domain adaptation ability to produce robust effect estimates against domain shifts, even when only the control group is observed. This flexibility in handling various text interventions is a key advancement in causal estimation for textual data, offering opportunities to better understand human behaviors and develop effective policies within social systems.


A Critical Look at the Consistency of Causal Estimation with Deep Latent Variable Models

Neural Information Processing Systems

Using deep latent variable models in causal inference has attracted considerable interest recently, but an essential open question is their ability to yield consistent causal estimates. While they have demonstrated promising results and theory exists on some simple model formulations, we also know that causal effects are not even identifiable in general with latent variables. We investigate this gap between theory and empirical results with analytical considerations and extensive experiments under multiple synthetic and real-world data sets, using the causal effect variational autoencoder (CEVAE) as a case study. While CEVAE seems to work reliably under some simple scenarios, it does not estimate the causal effect correctly with a misspecified latent variable or a complex data distribution, as opposed to its original motivation. Hence, our results show that more attention should be paid to ensuring the correctness of causal estimates with deep latent variable models.


Causal Inference from Text: Unveiling Interactions between Variables

Zhou, Yuxiang, He, Yulan

arXiv.org Artificial Intelligence

Adjusting for latent covariates is crucial for estimating causal effects from observational textual data. Most existing methods only account for confounding covariates that affect both treatment and outcome, potentially leading to biased causal effects. This bias arises from insufficient consideration of non-confounding covariates, which are relevant only to either the treatment or the outcome. In this work, we aim to mitigate the bias by unveiling interactions between different variables to disentangle the non-confounding covariates when estimating causal effects from text. The disentangling process ensures covariates only contribute to their respective objectives, enabling independence between variables. Additionally, we impose a constraint to balance representations from the treatment group and control group to alleviate selection bias. We conduct experiments on two different treatment factors under various scenarios, and the proposed model significantly outperforms recent strong baselines. Furthermore, our thorough analysis on earnings call transcripts demonstrates that our model can effectively disentangle the variables, and further investigations into real-world scenarios provide guidance for investors to make informed decisions.


Causal Estimation for Text Data with (Apparent) Overlap Violations

Gui, Lin, Veitch, Victor

arXiv.org Artificial Intelligence

Consider the problem of estimating the causal effect of some attribute of a text document; for example: what effect does writing a polite vs. rude email have on response time? To estimate a causal effect from observational data, we need to adjust for confounding aspects of the text that affect both the treatment and outcome--e.g., the topic or writing level of the text. These confounding aspects are unknown a priori, so it seems natural to adjust for the entirety of the text (e.g., using a transformer). However, causal identification and estimation procedures rely on the assumption of overlap: for all levels of the adjustment variables, there is randomness leftover so that every unit could have (not) received treatment. Since the treatment here is itself an attribute of the text, it is perfectly determined, and overlap is apparently violated. The purpose of this paper is to show how to handle causal identification and obtain robust causal estimation in the presence of apparent overlap violations. In brief, the idea is to use supervised representation learning to produce a data representation that preserves confounding information while eliminating information that is only predictive of the treatment. This representation then suffices for adjustment and satisfies overlap. Adapting results on non-parametric estimation, we find that this procedure is robust to conditional outcome misestimation, yielding a low-absolute-bias estimator with valid uncertainty quantification under weak conditions. Empirical results show strong improvements in bias and uncertainty quantification relative to the natural baseline.


Enhancing Causal Estimation through Unlabeled Offline Data

Teichner, Ron, Meir, Ron, Eitan, Danny

arXiv.org Machine Learning

Consider a situation where a new patient arrives in the Intensive Care Unit (ICU) and is monitored by multiple sensors. We wish to assess relevant unmeasured physiological variables (e.g., cardiac contractility and output and vascular resistance) that have a strong effect on the patients diagnosis and treatment. We do not have any information about this specific patient, but, extensive offline information is available about previous patients, that may only be partially related to the present patient (a case of dataset shift). This information constitutes our prior knowledge, and is both partial and approximate. The basic question is how to best use this prior knowledge, combined with online patient data, to assist in diagnosing the current patient most effectively. Our proposed approach consists of three stages: (i) Use the abundant offline data in order to create both a non-causal and a causal estimator for the relevant unmeasured physiological variables. (ii) Based on the non-causal estimator constructed, and a set of measurements from a new group of patients, we construct a causal filter that provides higher accuracy in the prediction of the hidden physiological variables for this new set of patients. (iii) For any new patient arriving in the ICU, we use the constructed filter in order to predict relevant internal variables. Overall, this strategy allows us to make use of the abundantly available offline data in order to enhance causal estimation for newly arriving patients. We demonstrate the effectiveness of this methodology on a (non-medical) real-world task, in situations where the offline data is only partially related to the new observations. We provide a mathematical analysis of the merits of the approach in a linear setting of Kalman filtering and smoothing, demonstrating its utility.